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AN  ALGORITHM  FOR  POSITIFB  INFINITE  LEAST  SQOARE 
ESTIMATIOH  OF  PARAMETERS 

Hui  Hu 


^  Abstract 

-We  presentj^an  algorithm  for  positive  definite  least  square 
estimation  of  parameters.  This  estimation  problem  arises  from  the  PILOT 
dynamic  macro-economic  model  and  is  equivalent  to  an  infinite  convex 
quadratic  program.  It  differs  from  ordinary  least  square  estimations  in 
that  the  fitting  matrix  is  required  to  be  positive  definite.  The 
algorithm  solves  the  infinite  convex  quadratic  program  by  generating  and 
solving  a  sequence  of  ordinary  convex  quadratic  programs.  By  specifying 
a  constant,  the  algorithm  will  find  an  approximate  optimal  solution 
after  finitely  many  iterations,  or  will  tend  to  an  optimal  solution  In 
the  limit.  The  algorithm  is  generalized  to  solve  a  class  of  infinite 
convex  programs. 

Key  Words.  Least  square  estimation,  quadratic  programming,  positive 
definite  matrix. 


1.  Introduction  and  Preliminaries 

We  resolve  the  following  problem.  Given  two  sequences  of  vectors 

a*"  and  b^  in  r"  with  t  >  1,  ...  ,  L,  a  small  positive  number  c 

and  a  large  number  K  >  >  e ,  we  want  to  find  a  real  symmetric  matrix 

L  t  t  2 

X  =  such  that  ’Xa  -  b  I  is  minimized  among  all  the  real 

square  matrices  X  satisfying  conditions  (a)  and  (b): 


in  \  m 


(b)  the  smallest  eigenvalue  of  X  Is  no  less  than  e. 

This  problem  differs  from  ordinary  least  square  estimations  In  that 

the  fitting  matrix  X  Is  required  to  be  positive  definite  (Lemma  1). 

It  arises  from  the  PILOT  dynamic  macro-economic  model  designed  to  assess 

the  long  term  Impact  of  foreign  competition.  Innovation,  modernization 

and  energy  need  (Dantzlg  HI).  It  Is  a  nonlinear  optimization  problem 

with  matrix  variables  and  constraints. 

n*"  1  n  T 

Throughout  this  paper,  S  =  (x  c  R  ;  x  x  =1}  denotes  the  unit 
sphere  In  r"  and  I* I  denotes  the  Euclidean  norm.  For  a  real 
symmetric  matrix  B,  X[B)  stands  for  the  smallest  eigenvalue  of  B  and 
v[B]  a  corresponding  eigenvector  of  unit  length.  Superscripts  on 
vectors  are  used  to  denote  different  vectors,  while  subscripts  are  used 
to  denote  components  of  a  vector. 

Before  solving  this  problem,  we  state  some  lemmas  about  positive 
definite  matrices  and  real  symmetric  matrices.  These  lemmas  are  either 
obvious  or  well-known;  however,  they  play  an  Important  role  throughout 
our  discussion. 

Leaia  1.  For  any  real  symmetric  matrix  B,  the  following  are  equivalent 

(1)  B  Is  positive  definite; 

(2)  X(B],  the  smallest  eigenvalue  of  B,  Is  positive; 

T  I 

(3)  there  exists  6  >  0  such  that  u  Mu  6  for  all  u  c  S  ,  where 
s"  ^  =  {x  e  R*':  x^x  =1}  Is  the  unit  sphere  In  r'''.  □ 


T  ii*"  1 

hemuL  2.  For  any  real  syometrlc  matrix  B,  X[B]  ■  min  {u^'Bu:  u  e  S  } 
(see,  e.g.,  Wilkinson  [4],  p.98-99).  □ 

Lemma  3.  For  any  real  symmetric  matrix  B,  X[B]  Is  a  continuous 
function  of  the  elements  of  B  (see,  e.g.,  Isaacson  and  Keller  [3], 
p.l36).  □ 

To  solve  this  estimation  problem,  we  first  transform  it  Into  an 
equivalent  (vector  form)  Infinite  convex  quadratic  program. 

Given  a^,  c  r”  for  t  “  1,  ...  ,  L,  let  A  ■  (a^  •••  a^)'*' 

and  B^  -  (b|  •••  b^)**'  for  all  1*1,  ...  ,n.  Let  M  be  an  n^  x 

T  2 

block-diagonal  matrix  with  diagonal  blocks  A  A  and  E  be  an  nL  x  n 

block-diagonal  matrix  with  diagonal  blocks  A.  Let  be  the  1-th 

row  of  matrix  X  for  all  1->1,  ...  ,n  and  F(*)  be  a  bljectlon 

2 

from  to  r”  ,  Y  ■  F(X)  ■  (X,.  •••  X  ).  Then,  In  terms  of 

1  •  n  • 

vectors,  condition  (a)  becomes: 

*(l-l)n+J  ■  *1J  ■  ■'jl  ■  "(J-D.+l  ■  ' . "  •"'* 

-K  <  Y,  <  K  for  all  k  ■  I,  ...  .  n  . 

—  k  — 

X  n*  1 

By  Lemma  2,  condition  (b)  Is  equivalent  to:  u^Xu  e  for  all  u  e  S  . 
Thus,  condition  (b)  becomes: 


The  objective  function  becomes: 


I  IXa*"  -  bS^  -  I  I  (X  y  -  b*")^ 

t=»l  t-1  i-l 

=  I  I  (X^.a^  -  b[)2  -  I  1A(X,.)^  - 
i=l  t-1  i»l 

=  I  {X  A^kiX  y  -  2(B^)'^A(X  +  (B^)V} 

i=l 

=  y'^MY  -  2((bS^  •••  (b")'’^)EY  +  I  (B^)V. 

i=l 


Consequently,  the  equivalent  vector  form  optimization  problem  is: 


(IQP):  minimize  Y^MY  -  2((Bb^  •••  (b")^)EY  + 


subject  to 

(u.u^  •••  u  u^)Y  >  e  for  all  u  c  s"  ^ 
1  n  — 


Y  Y  ®  0 

\i-l)n+j  ^j-l)n+i 

£  K  for  all  i  = 


for  all  i,j 

,  2 

••• 


•  •  • 


,  « 


Remark  1. 

(1)  M  is  positive  semldef Inite  because  each  of  its  diagonal 
T 

blocks  A  A  is  positive  semldef inite.  This  implies  that  the  objective 
function  of  (IQP)  is  convex.  The  feasible  region  of  (IQP)  is 
compact  and  convex,  and  is  defined  by  an  infinite  number  of  linear 
constraints.  Moreover,  it  is  nonempty  since  Y  *  F(el)  is  a  feasible 
solution,  where  I  is  the  identity  matrix.  Therefore,  (IQP)  is  a 
feasible  infinite  convex  quadratic  program  and  optimal  solutions  for 
(IQP)  exist. 


(2)  Since  Y  -  F(X)  -  (X^ .  •••  X^^)  is  a  bljection,  F  ( •) 

exists.  Thus  X  “  F  ^(Y)  and  u^Xu  “  u^F  ^(Y)u  ■  (u.u^  •••  u  u'^)Y. 

1  n 

T  _i  T  T 

For  convenience,  we  use  u  F  (Y)u  and  (u,u  •••  u  u  )Y 

1  n 

Interchangeablely. 

We  have  shown  that  this  positive  definite  least  square  estimation 
problem  is  equivalent  to  the  infinite  convex  quadratic  program  (IQP). 
In  Section  2,  we  propose  an  algorltbm  for  solving  (IQP)  and  prove  its 
convergence.  In  Section  3,  we  present  computational  results  for 
randomly  generated  data.  Finally,  we  show  that  the  algorithm  presented 
in  Section  2  can  be  generalized  to  solve  a  class  of  infinite  convex 
programs  in  Section  4. 

2.  An  Algorithm  and  its  Convergence 

We  propose  an  algorithm  for  solving  the  infinite  quadratic  program 
(IQP).  This  algorithm  solves  (IQP)  by  generating  and  solving  a 
sequence  of  feasible  convex  quadratic  programs  (QP(k))  for  k  *  1, 

2,  ...  .  Each  (QP(k))  has  the  same  objective  function  as  that  of 
(IQP). 

Algorithm  1 

Step  1. 

Let  k:  ••  0; 

let  o  be  a  constant  such  that  e  £  a  <  <  K; 
let  (QP(k))  be  the  following  quadratic  program: 


minimize  Y^MY  -  2((bS^ 


•  •  • 


'  J  •W'-  J  ^ 


(b")''‘)EY  +  Ij^(B^)V 

subject  to 

^(l-l)n+j  “  \j-l)n+l  “  ***  ’  " 

i  1  K,  1  -  1,  ...  ,  n^. 

Step  2. 

)( 

Find  an  optimal  solution  Y  of  (QP(k)); 

let  X  -  F  (Y  ),  l.e.,  -  ’’^(l-l)n+j*  ^ . . 

calculate  X(X^]  and  v(X^l; 

If  X,[x'^]  >_  e,  go  to  Step  A. 

Step  3. 

Let  u*^  *  vtx'^l ; 

form  (QP(k+l))  by  adding  a  cut,  (u*'’)^F  ^(Y)u'^  ^  o  ,  to  (QP(k)); 
k;  »  k  +  1; 
go  to  Step  2. 

Step  A. 

If  a  >  e,  y'^ 

If  a  -  e,  y’^ 

CoMBents. 

(1)  For  any  k,  the  feasible  region  of  (QP(k))  Is  a  nonempty 
polytope  since  Y  ■  F(al)  is  a  feasible  solution.  Therefore,  optimal 
solutions  exist  for  all  (QP(k)).  Furthermore,  since  the  objective 
function  of  (QP(k))  Is  quadratic  and  convex,  there  are  finite 
algorithms  for  finding  Its  optimal  solutions. 


Is  an  approximate  optimal  solution  of  (IQP);  stop. 
Is  an  optimal  solution  of  (IQP);  stop. 
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(2)  Efficient  finite  algorithms  for  calculating  eigenvalues  and 
eigenvectors  of  a  matrix  can  be  found  in  Wilkinson  [4]. 

(3)  k  counts  the  number  of  major  iterations  (step  Z-step  3),  or 
equivalently,  the  number  of  cuts  added  before  termination.  Each  major 
iteration  can  be  processed  finitely. 

(4)  a  is  a  constant  and  e  ^  a  <  <  K.  If  o  >  e,  then  an 

approximate  optimal  solution  will  be  found  after  a  finite  number  of 

major  iterations  (see  Theorem  1).  If  o  *  e,  then  any  cluster  point  of 
0  12 

of  the  sequence  Y  ,  Y  ,  Y  ,  . . .  is  an  optimal  solution  for  (IQP)  (see 
Theorems  2  and  3). 

Theorem  1.  If  a  >  e,  then  Algorithm  1  can  find  an  approximate  optimal 
solution  for  (IQP)  after  a  finite  number  of  major  Iterations. 

Proof.  Let  H(Y)  ■=  y'^^MY  -  2((B^)’'‘’  •••  (b")’'’)EY  +  I^(B^)V  be  the 
objective  functions  of  (IQP)  and  (QP(k))  for  all  k.  Let 
C  =  {Y:  -K  <  Y^  <  K,  i  -  1,  ...  ,  n^}  x  and  G(Y,u)  -  uV^(Y)u. 

Then  G(Y,u)  is  a  continuous  function  and  C  is  a  compact  set.  Hence, 
G(Y,u)  is  uniformly  continuous  on  C,  l.e.,  for  any  6  >  0,  there  exists 
■q  >  0  such  that 

(c)  ll(Y,u)  -  (Y,u)»  <  Tj  implies  |G(Y,u)  -  G(Y,G)|  <  6  for  all 
(Y,u)  and  (Y,u)  in  C. 

In  particular,  for  6  »  a  -  e  >  0,  there  exists  t)  >  0  such  that  (c) 

If  0“  1 

holds.  If  Algorithm  1  goes  on  infinitely,  then  it  generates  u  e  S 
for  k“0,  1,  2,  ...  .  Since  s”  ^  is  compact,  for  the  ti  >  0,  there 


exist  and  in  the  sequence  such  that  lu^^  -  u^J  I  <  q. 

Without  loss  of  generality,  we  assume  that  .  Since  Algorithm  1 

does  not  stop  at  iteration  k^,  we  have: 


(d)  G(Y‘"j,u'"i)  =  (u*^i)V^(Y*^i)u‘^i  >  a; 


(e)  GCy'^J.u'^J)  =  (u*"J)V^(y‘"J)u'^J  <  e. 


However,  (d)  and  (e)  imply  that  (G(Y*^J  ,0^^^)  -  GCy'^J  ,u'^J )  |  >  a  -  e  =  6 

while  l(Y'^j,u^^)  -  (Y^J,u^J)l  <  rj,  which  contradicts  the  uniform 

continuity  of  G(Y,u)  on  C.  Therefore,  if  a  >  e,  then  Algorithm  1 

terminates  finitely.  Suppose  that  it  stops  at  iteration  k.  Then 
-Ik  k 

^[F  (Y  )]  >^  e  and  by  Lemma  2,  Y  is  a  feasible  solution  of  (IQP). 

i  T  — I  i 

However,  since  a  >  e,  the  constraints  (u  )  F  (Y)u  ^  a  for  i  =  0, 
1,  ...  ,  k-1  may  be  violated  by  certain  feasible  solutions  of  (IQP). 
We  can  not  guarantee  that  H(Y  )  <_  H(Y)  holds  for  all  feasible  Y. 

But,  if  Y  e  {Y  €  r";  u^f"'^(Y)u  >  a  for  all  u  e  s"“\  f“^(Y)^  »  F~^(Y) 
-K  Yj^  £  K  for  1*1,  ...  ,  n^}  ,  then  H(Y*^)  ^  H(Y)  is  guaranteed  to 
hold.  Therefore,  Y  is  only  an  approximate  optimal  (or  a-suboptimal) 
solution  in  the  case  a  >  e.  □ 

Remark  2.  When  a  increases,  the  number  of  cuts  added  before 
terminating  decreases,  but  the  final  objevtlve  function  value 
Increases.  For  given  data  and  choice  of  a,  if  the  algorithm  does  not 
terminate  by  a  specified  number  of  iterations,  we  can  increase  a  and 
try  again. 

Theorem  2.  If  a  e  and  Algorithm  1  stops  at  a  certain  iteration  i, 
then  Y^  is  an  optimal  solution  of  (IQP). 


» 

I 

;  Proof.  Let  H(Y)  »  y’^MY  -  2((bS^  •••  (b")^)EY  +  be  the 

I  objective  functions  of  (IQP)  and  (QP(k.))  for  all  k.  If  Algorithm  1 

I  stops  at  a  certain  iteration  i,  then  A.[X^]  e.  By  Lemma  2,  Y^  is  a 

I  feasible  solution  of  (IQP).  Next,  suppose  that  Y  is  an  arbitrary 

^  feasible  solution  of  (IQP).  Because  a  =  e,  Y  is  feasible  for  all 

(QP(k)).  In  particular,  Y  is  feasible  for  (QP(i)).  Since  Y^  is  an 
optimal  solution  of  (QP(i)),  we  have  H(Y^)  <^H(Y).  Therefore,  Y^  is 
an  optimal  solution  of  (IQP).  0 

Theorem  3.  Suppose  that  a  =  e  and  that  Algorithm  1  does  not  stop 

0  12 

finitely.  Let  the  sequence  Y  ,  Y  ,  Y  ,  ...  be  generated  by 
Algorithm  1.  Then  any  cluster  point  is  a  solution  for  (IQP). 

Ic  k 

Proof.  Let  Y  and  u  for  k=0,  1,  2,  ...  be  generated  by 

Algorithm  1  with  a  =  e.  Since  Y^,  y\  ...  are  in  a  compact  set,  there 

exist  cluster  points.  Let  Y  be  a  cluster  point  and  without  loss  of 

k 

generality  we  assume  that  =  Y.  We  claim  that  for  any 

0  <  p  <  e,  there  exists  an  integer  N(  p)  such  that  X.[F  ^(y''^))  ^  P  for 
all  k  N(P).  Let  H(Y),  G(Y,u)  and  C  be  defined  as  before.  Since 

G(Y,u)  is  uniformly  continuous  on  C,  for  6  =  e  -  p  >  0,  there  exists 
q  >  0  such  that 

(f)  II(Y,u)  -  (Y,G)I  <  q  Implies  |G(Y,u)  -  G(Y,u)|  <  6  for  all 

(Y,u)  and  (Y,u)  in  C.  | 

( 

I 

Suppose  that  the  above  claim  is  not  true.  Then  there  exist  k,  and  k  ' 

1  j  j 

(k^  <  kj)  such  that  XIf'^y’^J)]  <  p  and  «u*^i  -  u*^ j  II  <  q.  Thus, 

1 

1 
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(g)  >  e; 

(h)  G(Y*"J,u’"j)  =  (u'"J)V^(Y^J)u‘^J  =  X(F~^(y'"J)]  <  P. 

However,  (g)  and  (h)  imply  that  [gCy'^^J  -  G(  Y*'^ J , )  |  >  e  -  P  =  6 

while  ll(Y*^j,u^i)  -  (Y*^J  ,u*^J )  H  <  ti,  which  contradicts  (f).  Therefore, 
the  above  claim  is  proved.  By  the  claim  and  the  continuity  of  F  ^•) 
and  X[*]  (Lemma  3),  we  have  \[F  ^(Y)]  p  for  any  p  satisfying 

0  <  P  <  e.  Consequently,  X[F  ^(Y)l  e  and  Y  is  feasible  for 
(IQP).  Next,  let  Y  be  an  arbitrary  feasible  solution  of  (IQP).  Then 
Y  is  feasible  for  all  (QP(k)).  Because  Y  is  an  optimal  solution  of 
(QP(k))  and  the  objective  functions  of  (QP(k))  and  (IQP)  are  the 
same,  we  have  H(Y  )  H(Y)  for  k=0,  1,  2,  ...  .  It  follows  that 

H(Y)  <  H(Y)  holds  for  all  feasible  Y  and  Y  solves  (IQP).  D 

Reaark  3.  Since  a  =  e,  the  feasible  region  of  (QP(k))  contains  that 
of  (IQP).  As  the  algorithm  goes  on,  Y  becomes  more  and  more  close  to 
the  feasible  region  of  (IQP)  and  H(Y  )  tends  increasingly  to  H(Y). 

3.  Computational  Results 

We  have  coded  Algorithm  1  in  FORTRAN.  We  use  the  subroutine  QPSOL 
(from  the  Systems  Optimization  Laboratory,  Department  of  Operations 
Research,  Stanford  University)  to  solve  (QP(k))  and  the  subroutine 
F02ABF  (from  NAG  Library,  Stanford  University)  to  calculate  eigenvalues 
and  eigenvectors. 
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The  program  was  executed  on  a  DEC  20  compuCor.  All  components  of 
the  input  data  a^  and  b*"  are  lid  U(-0,5,0.5). 

First,  we  compute  one  problem  six  times  with  different  a  values 
to  demonstrate  the  influence  of  a  on  the  number  of  major  iterations 
and  on  the  final  objective  values,  see  Table  1. 

Given  data: 

L  =  8,  e  »  1.0,  K  -  10^,  n^  -  16; 

-  (-0.3052  0.1087  -0.3915  -0.4383) 

a^  -  (  0.1379  0.1707  -0.1208  0.3839) 

a^  -  (  0.2999  -0.4803  0.1790  -0.2021) 

&  =-  (-0.1334  0.1864  -0.0431  0.4557) 
a^  =  (-0.0681  -0.4627  -0.1384  0.0547) 
a^  =  (-0.4691  0.0743  0.3823  0.1650) 

a  =  (-0.2117  -0.3549  0.4991  -0.1264) 

a®  =  (-0.0865  0.0886  -0.4886  -0.3304) 

b^  =  (  0.2325  -0.1774  -0.3115  0.2133) 

b^  =  (-0.4512  -0.1078  0.0383  -0.0906) 

b®  =  (-0.0641  -0.3664  -0.1086  -0,3182) 
b^  =»  (-0.3645  -0.1941  -0.1331  -0.3830) 
b^  =  (-0.2327  -0.0301  -0.0613  0.2470) 
b®  -  (-0.3909  0.3732  -0.0953  -0.1953) 

b^  -  (-0.1478  -0.2652  -0.3996  0.3307) 

b®  -  (-0.2671  0.3283  0.0569  -0.3668) 


Table  1 


value 
of  a 

no.  of  major 
iterations 

final  objective 
function  value 

CPU  time 
(second) 

1.1 

4 

5.2713 

1.89 

1.01 

7 

4.8017 

2.77 

1.001 

13 

4.7585 

4.94 

1.0001 

14 

4.7537 

5.44 

1.00001 

15 

4.7532 

5.88 

1.000001 

16 

4.7532 

6.63 

Next,  we  solve  a  number  of  problems  in  different  dimensions,  see 
Table  2.  Again,  all  components  of  the  input  data  a^  and  b^  are  iid 
U( -0.5,0. 5). 


Table  2 


problem 
dimension 
(  n  ,  L  ) 

value  of 
a  and  c 

(  a,  e  ) 

no.  of 

major 

Itera. 

value  of 
final  obj . 
function 

CPU  time 
(second) 

(16,  6) 

(1.01,  1) 

7 

3.3775 

2.90 

(16,10) 

(1.01,  1) 

8 

5.9837 

3.23 

(36,  8) 

(1.01,  1) 

15 

9.0326 

25.18 

(36,12) 

(1.01,  1) 

17 

14.1864 

28.19 

(64,12) 

(1.01,  1) 

33 

19.1114 

219.29 

(64,18) 

(1.01,  1) 

26 

29.6196 

175.32 

I 

I 

I 


i 

\ 

j 
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4.  A  GenerallzatloD 


Herein  we  show  that  Algorithm  1  presented  In  Section  2  can  be 
generalized  to  solve  the  following  Infinite  convex  program  (ICP). 

(TCP):  minimize  f(x) 

subject  to 

g(x,u)  0  for  all  u  e  U 
X  e  S 

where  U  and  S  are  compact  convex  sets,  f(x)  Is  a  convex  function  on 
S,  g(x,u)  Is  continuous  on  S  x  U  and  Is  concave  In  x  when  u  Is 
fixed  and  convex  In  u  when  x  Is  fixed. 

Definition  (e-optlmal  solution)  A  vector  x  e  S  Is  an  e-optlmal 
solution  of  (ICP)  If  g(x,u)  2^  -e  for  all  u  €  U  and 
f(x)  <  v(ICP),  where  v(ICP)  Is  the  optimal  objective  function  value 
of  (ICP). 

Algorithm  2 

Step  1. 

Let  k:  ■  0; 

let  (CP(k))  be  the  following  convex  program: 
minimize  f(x) 


subject  to 


step  2 


If  (CP(k))  Is  infeasible,  go  to  Step  4; 
find  an  optimal  solution  x  of  (CP(k)); 
if  g(x  ,u)  0  for  all  u  e  U,  go  to  Step  5. 

Step  3. 

Vc  Ic  Ic 

Find  a  u  e  U  satisfying  g(x  ,u  )  <  0. 

form  (CP(k+l))  by  adding  a  cut  g(x,u*^)  >^0  to  (CP(k)); 

k:  =  k  +  1; 

go  to  Step  2. 

Step  4. 

(ICP)  Is  Infeasible,  stop. 

Step  5. 

Ic 

x  Is  an  optimal  solution  of  (ICP),  stop. 

Notice  that  g(x,u)  is  uniformly  continuous  on  S  x  U,  It  is  not 
hard  to  prove  the  following  theorems. 

Theorem  4  (finite  E-convergence)  For  any  e  >  0,  Algorithm  2  can  find 
an  e-optimal  solution  of  (ICP)  after  finitely  many  iterations.  □ 

Theorem  5  (convergence)  If  Algorithm  2  does  not  stop  finitely,  then  any 
cluster  point  of  the  sequence  x  for  k  ■  1,  2  ,  ...  is  an  optimal 
solution  of  (ICP).  0 
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